knitr document van Steensel lab

TF reporter barcode processing - pMT02 - stimulation 1

Introduction

18,000 TF reporters on pMT02 were transfected into mESCs and NPCs (7 different conditions in total). Sequencing yielded barcode counts for these experiments, which are processed in this script.

Analysis

Add barcode annotation to barcode counts & extract first bc read count information

Get a closer look at unmatched barcodes
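
The annotation step and the check for unmatched barcodes can be sketched as follows. This is a minimal base-R illustration, not the code used in this script: the data frames `bc_counts` and `bc_annotation`, their column names, and all values are hypothetical.

```r
# Hypothetical toy data; column names and values are assumptions.
bc_counts <- data.frame(
  barcode = c("ACGTACGTACGT", "TTGCATGCATGC", "GGGGCCCCAAAA"),
  count   = c(120L, 45L, 3L),
  stringsAsFactors = FALSE
)
bc_annotation <- data.frame(
  barcode  = c("ACGTACGTACGT", "TTGCATGCATGC"),
  reporter = c("TF_A_reporter", "TF_B_reporter"),
  stringsAsFactors = FALSE
)

# Left join: keep every counted barcode, whether or not it is annotated.
annotated <- merge(bc_counts, bc_annotation, by = "barcode", all.x = TRUE)

# Barcodes without an annotation are the unmatched ones.
unmatched <- annotated[is.na(annotated$reporter), ]

# Fraction of all reads that fall on annotated barcodes.
matched_fraction <- sum(annotated$count[!is.na(annotated$reporter)]) /
  sum(annotated$count)
```

Inspecting `unmatched` sorted by `count` is one way to see whether unmatched reads are spread over many rare barcodes or concentrated in a few abundant ones.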

Check if the pDNA-bc count correlates with the barcode count in the pDNA-insert-seq data

Conclusion barcode clustering:
- I manually merged barcodes that showed high correlation and a Levenshtein distance of 1 into a single barcode to obtain more reads
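
As an illustration of the distance criterion only, the sketch below finds barcode pairs at Levenshtein distance 1 and adds the reads of the rarer barcode to the more abundant one. It uses base R `adist()`; the barcodes and counts are hypothetical, the correlation criterion is not modeled, and the loop assumes each barcode appears in at most one pair.

```r
# Hypothetical barcodes and read counts; values are illustrative only.
counts <- c(ACGTACGTACGT = 500, ACGTACGTACGA = 12, TTGCATGCATGC = 300)
barcodes <- names(counts)

# Pairwise Levenshtein distances between all barcodes.
d <- adist(barcodes)
dimnames(d) <- list(barcodes, barcodes)

# Candidate pairs: distance exactly 1, each pair reported once (upper triangle).
idx <- which(d == 1 & upper.tri(d), arr.ind = TRUE)
pairs <- data.frame(
  bc1 = barcodes[idx[, "row"]],
  bc2 = barcodes[idx[, "col"]],
  stringsAsFactors = FALSE
)

# Merge each pair: add the reads of the rarer barcode to the more abundant one.
merged <- counts
for (i in seq_len(nrow(pairs))) {
  pair <- c(pairs$bc1[i], pairs$bc2[i])
  keep <- pair[which.max(counts[pair])]
  drop <- setdiff(pair, keep)
  merged[keep] <- merged[keep] + merged[drop]
  merged <- merged[names(merged) != drop]
}
```

With real data, candidate pairs would still need to be checked by hand (for example against their count correlation) before merging.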

Compare differently clustered pDNA data

Data quality plots

Normalization of barcode counts:

Divide cDNA barcode counts by pDNA barcode counts to get activity
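
A minimal sketch of this ratio on hypothetical counts is shown below. Scaling each library to reads per million before dividing is an assumption made for the example; the column names and values are illustrative.

```r
# Hypothetical counts for three barcodes.
df <- data.frame(
  barcode = c("bc1", "bc2", "bc3"),
  cDNA    = c(200, 50, 750),
  pDNA    = c(100, 100, 300)
)

# Assumption: scale each library to reads per million before taking the ratio.
df$cDNA_rpm <- df$cDNA / sum(df$cDNA) * 1e6
df$pDNA_rpm <- df$pDNA / sum(df$pDNA) * 1e6

# Activity: normalized cDNA signal relative to normalized pDNA input.
df$activity <- df$cDNA_rpm / df$pDNA_rpm
```

Barcodes with zero or very low pDNA counts would give undefined or unstable ratios, so they are usually filtered out before this step.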

Calculate mean activity - filter out outlier barcodes
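
One possible way to do this is sketched below. The outlier rule (more than 3 median absolute deviations from the reporter median) is an assumption chosen for illustration, and the data are hypothetical; the script's actual criterion may differ.

```r
# Hypothetical per-barcode activities for two reporters.
df <- data.frame(
  reporter = c(rep("A", 5), rep("B", 3)),
  activity = c(1.0, 1.1, 0.9, 1.2, 8.0, 2.0, 2.2, 1.8)
)

# Assumption: an outlier lies more than k MADs from its reporter's median.
is_outlier <- function(x, k = 3) {
  abs(x - median(x)) > k * mad(x)
}

# Flag outliers within each reporter; ave() returns 0/1, so compare to 1.
df$outlier <- ave(df$activity, df$reporter, FUN = is_outlier) == 1

# Mean activity per reporter over the remaining barcodes.
mean_activity <- aggregate(activity ~ reporter,
                           data = df[!df$outlier, ],
                           FUN = mean)
```

In this toy example only the barcode with activity 8.0 in reporter A is flagged, and the means are computed from the remaining barcodes.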

Calculate correlations between technical replicates
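
Replicate agreement can be summarized with base R `cor()`, as in the sketch below. The replicate columns `rep1` and `rep2` and their values are hypothetical, and computing Pearson correlation on log2-transformed activities is an assumption.

```r
# Hypothetical activities of five reporters in two technical replicates.
act <- data.frame(
  rep1 = c(0.5, 1.0, 2.0, 4.0, 8.0),
  rep2 = c(0.6, 0.9, 2.2, 3.8, 8.5)
)

# Pearson correlation on log2 activities (sensitive to the linear relationship).
r_pearson <- cor(log2(act$rep1), log2(act$rep2), method = "pearson")

# Spearman correlation on ranks (unaffected by monotonic transformations).
r_spearman <- cor(act$rep1, act$rep2, method = "spearman")
```

Reporting both values helps separate disagreement in ranking from disagreement in scale between replicates.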

Data quality plots - correlation between replicates

## `geom_smooth()` using formula 'y ~ x'

Session Info

paste("Run time: ",format(Sys.time()-StartTime))
## [1] "Run time:  4.314425 mins"
getwd()
## [1] "/DATA/usr/m.trauernicht/projects/SuRE-TF/gen-1_stimulation-1"
date()
## [1] "Tue Jul 13 17:18:04 2021"
sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] pheatmap_1.0.12             PCAtools_2.2.0             
##  [3] ggrepel_0.9.1               DESeq2_1.30.1              
##  [5] SummarizedExperiment_1.20.0 Biobase_2.50.0             
##  [7] MatrixGenerics_1.2.1        matrixStats_0.59.0         
##  [9] GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
## [11] IRanges_2.24.1              S4Vectors_0.28.1           
## [13] BiocGenerics_0.36.1         tidyr_1.1.3                
## [15] LncFinder_1.1.4             gridExtra_2.3              
## [17] RColorBrewer_1.1-2          readr_1.4.0                
## [19] haven_2.4.1                 ggbeeswarm_0.6.0           
## [21] plotly_4.9.4.1              tibble_3.1.2               
## [23] dplyr_1.0.7                 vwr_0.3.0                  
## [25] latticeExtra_0.6-29         lattice_0.20-41            
## [27] stringdist_0.9.6.3          GGally_2.1.2               
## [29] ggpubr_0.4.0                ggplot2_3.3.5              
## [31] stringr_1.4.0               plyr_1.8.6                 
## [33] data.table_1.14.0          
## 
## loaded via a namespace (and not attached):
##   [1] readxl_1.3.1              backports_1.2.1          
##   [3] lazyeval_0.2.2            splines_4.0.5            
##   [5] crosstalk_1.1.1           BiocParallel_1.24.1      
##   [7] digest_0.6.27             foreach_1.5.1            
##   [9] htmltools_0.5.1.1         fansi_0.5.0              
##  [11] magrittr_2.0.1            memoise_2.0.0            
##  [13] openxlsx_4.2.4            recipes_0.1.16           
##  [15] annotate_1.68.0           gower_0.2.2              
##  [17] jpeg_0.1-8.1              colorspace_2.0-2         
##  [19] blob_1.2.1                xfun_0.24                
##  [21] crayon_1.4.1              RCurl_1.98-1.3           
##  [23] jsonlite_1.7.2            genefilter_1.72.1        
##  [25] survival_3.2-10           iterators_1.0.13         
##  [27] glue_1.4.2                gtable_0.3.0             
##  [29] ipred_0.9-11              zlibbioc_1.36.0          
##  [31] XVector_0.30.0            seqinr_4.2-8             
##  [33] DelayedArray_0.16.3       BiocSingular_1.6.0       
##  [35] car_3.0-11                abind_1.4-5              
##  [37] scales_1.1.1              DBI_1.1.1                
##  [39] rstatix_0.7.0             Rcpp_1.0.7               
##  [41] viridisLite_0.4.0         xtable_1.8-4             
##  [43] dqrng_0.3.0               rsvd_1.0.5               
##  [45] foreign_0.8-81            bit_4.0.4                
##  [47] proxy_0.4-26              lava_1.6.9               
##  [49] prodlim_2019.11.13        htmlwidgets_1.5.3        
##  [51] httr_1.4.2                ellipsis_0.3.2           
##  [53] farver_2.1.0              pkgconfig_2.0.3          
##  [55] reshape_0.8.8             XML_3.99-0.6             
##  [57] nnet_7.3-15               locfit_1.5-9.4           
##  [59] utf8_1.2.1                caret_6.0-88             
##  [61] labeling_0.4.2            tidyselect_1.1.1         
##  [63] rlang_0.4.11              reshape2_1.4.4           
##  [65] AnnotationDbi_1.52.0      cachem_1.0.5             
##  [67] munsell_0.5.0             cellranger_1.1.0         
##  [69] tools_4.0.5               generics_0.1.0           
##  [71] RSQLite_2.2.7             ade4_1.7-17              
##  [73] broom_0.7.8               fastmap_1.1.0            
##  [75] evaluate_0.14             yaml_2.2.1               
##  [77] ModelMetrics_1.2.2.2      knitr_1.33               
##  [79] bit64_4.0.5               zip_2.2.0                
##  [81] purrr_0.3.4               sparseMatrixStats_1.2.1  
##  [83] nlme_3.1-152              compiler_4.0.5           
##  [85] beeswarm_0.4.0            curl_4.3.2               
##  [87] png_0.1-7                 e1071_1.7-7              
##  [89] ggsignif_0.6.2            geneplotter_1.68.0       
##  [91] stringi_1.6.2             highr_0.9                
##  [93] forcats_0.5.1             Matrix_1.3-2             
##  [95] vctrs_0.3.8               pillar_1.6.1             
##  [97] lifecycle_1.0.0           irlba_2.3.3              
##  [99] cowplot_1.1.1             bitops_1.0-7             
## [101] R6_2.5.0                  rio_0.5.27               
## [103] vipor_0.4.5               codetools_0.2-18         
## [105] MASS_7.3-53.1             withr_2.4.2              
## [107] GenomeInfoDbData_1.2.4    mgcv_1.8-34              
## [109] hms_1.1.0                 beachmat_2.6.4           
## [111] grid_4.0.5                prettydoc_0.4.1          
## [113] rpart_4.1-15              timeDate_3043.102        
## [115] class_7.3-18              DelayedMatrixStats_1.12.3
## [117] rmarkdown_2.9             carData_3.0-4            
## [119] pROC_1.17.0.1             lubridate_1.7.10